Talk by Ian Goodfellow, University of Montreal. Given to the Redwood Center for Theoretical Neuroscience at UC Berkeley.
The traditional deep Boltzmann machine training algorithm requires a greedy layerwise pretraining phase. Existing techniques for avoiding greedy pretraining do not perform as well for classification as the layerwise method. I show that second-order methods applied to a deterministic training criterion can obtain better classification performance than the existing joint training methods.